Do Semidefinite Relaxations Solve Sparse PCA up to the Information Limit?

Authors

  • ROBERT KRAUTHGAMER
  • BOAZ NADLER
  • DAN VILENCHIK
Abstract

Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms have been developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is: under what conditions can such algorithms recover the sparse principal components? We study this question for a single-spike model with an ℓ0-sparse eigenvector, in the asymptotic regime as dimension p and sample size n both tend to infinity. Amini and Wainwright [Ann. Statist. 37 (2009) 2877–2921] proved that for sparsity levels k ≥ Ω(n / log p), no algorithm, efficient or not, can reliably recover the sparse eigenvector. In contrast, for k ≤ O(√(n / log p)), diagonal thresholding is consistent. It was further conjectured that an SDP approach may close this gap between computational and information limits. We prove that when k ≥ Ω(√n), the proposed SDP approach, at least in its standard usage, cannot recover the sparse spike. In fact, we conjecture that in the single-spike model, no computationally efficient algorithm can recover a spike of ℓ0-sparsity k ≥ Ω(√n). Finally, we present empirical results suggesting that up to sparsity levels k = O(√n), recovery is possible by a simple covariance thresholding algorithm.
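
As a rough illustration of the diagonal thresholding idea mentioned in the abstract, the following minimal Python sketch draws data from a single-spike model and estimates the support of the sparse eigenvector from the largest sample variances. Several things here are assumptions for illustration and are not taken from the paper: the parametrization x = sqrt(beta) * u * z + noise with z equal to 1/sqrt(k) on its support, the function names `sample_single_spike` and `diagonal_thresholding`, the choice to keep exactly the k largest variances with k known in advance, the zero-mean assumption, and all numeric values.

```python
import random


def sample_single_spike(n, p, support, beta, rng):
    """Draw n samples x = sqrt(beta) * u * z + noise (assumed parametrization).

    z has entries 1/sqrt(k) on `support` and 0 elsewhere, u ~ N(0, 1),
    and the noise is i.i.d. N(0, 1) in every coordinate.
    """
    k = len(support)
    amp = (beta / k) ** 0.5  # sqrt(beta) times the nonzero entry 1/sqrt(k) of z
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in support:
            x[j] += amp * u
        rows.append(x)
    return rows


def diagonal_thresholding(rows, k):
    """Return the indices of the k largest sample variances, sorted.

    The data are assumed to have zero mean, so the variance of coordinate j
    is estimated by the average of its squared entries.
    """
    n = len(rows)
    p = len(rows[0])
    variances = [sum(row[j] ** 2 for row in rows) / n for j in range(p)]
    top = sorted(range(p), key=lambda j: variances[j], reverse=True)[:k]
    return sorted(top)


# Hypothetical parameters, chosen so the spike is easy to detect.
rng = random.Random(0)
true_support = [3, 7, 11, 19, 23]
rows = sample_single_spike(n=1000, p=40, support=true_support, beta=5.0, rng=rng)
estimated_support = diagonal_thresholding(rows, k=len(true_support))
print(estimated_support)
```

Under this assumed model the population variance is 1 + beta/k on the support and 1 elsewhere, which is why ranking coordinates by sample variance can work when k is small relative to √n. The sketch does not implement the SDP relaxation or the covariance thresholding algorithm studied in the paper.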

Similar resources

Sparse Semidefinite Programs with Near-Linear Time Complexity

Some of the strongest polynomial-time relaxations to NP-hard combinatorial optimization problems are semidefinite programs (SDPs), but their solution complexity of up to O(n^6.5 L) time and O(n^4) memory for L accurate digits limits their use in all but the smallest problems. Given that combinatorial SDP relaxations are often sparse, a technique known as chordal conversion can sometimes reduce complex...

Approximating Semidefinite Packing Programs

In this paper we define semidefinite packing programs and describe an algorithm to approximately solve these problems. Semidefinite packing programs arise in many applications such as semidefinite programming relaxations for combinatorial optimization problems, sparse principal component analysis, and sparse variance unfolding techniques for dimension reduction. Our algorithm exploits the struc...

On the Worst-Case Approximability of Sparse PCA

It is well known that Sparse PCA (Sparse Principal Component Analysis) is NP-hard to solve exactly on worst-case instances. What is the complexity of solving Sparse PCA approximately? Our contributions include: 1. a simple and efficient algorithm that achieves an n^(-1/3)-approximation; 2. NP-hardness of approximation to within (1 − ε), for some small constant ε > 0; 3. SSE-hardness of approximation t...

On the Approximability of Sparse PCA

It is well known that Sparse PCA (Sparse Principal Component Analysis) is NP-hard to solve exactly on worst-case instances. What is the complexity of solving Sparse PCA approximately? Our contributions include: 1. a simple and efficient algorithm that achieves an n^(-1/3)-approximation; 2. NP-hardness of approximation to within (1 − ε), for some small constant ε > 0; 3. SSE-hardness of approximatio...

On semidefinite relaxations for the block model

The stochastic block model (SBM) is a popular tool for community detection in networks, but fitting it by maximum likelihood (MLE) involves an infeasible optimization problem. We propose a new semi-definite programming (SDP) solution to the problem of fitting the SBM, derived as a relaxation of the MLE. Our relaxation is tighter than other recently proposed SDP relaxations, and thus previously ...

Publication date: 2015